{
 "cells": [
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "# \ud83d\udcdd Exercise M5.02\n",
    "\n",
    "The aim of this exercise is to find out whether a decision tree model is able\n",
    "to extrapolate.\n",
    "\n",
    "By extrapolation, we refer to values predicted by a model outside of the range\n",
    "of feature values seen during the training.\n",
    "\n",
    "We first load the regression data."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "import pandas as pd\n",
    "\n",
    "penguins = pd.read_csv(\"../datasets/penguins_regression.csv\")\n",
    "\n",
    "feature_name = \"Flipper Length (mm)\"\n",
    "target_name = \"Body Mass (g)\"\n",
    "data_train, target_train = penguins[[feature_name]], penguins[target_name]"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "<div class=\"admonition note alert alert-info\">\n",
    "<p class=\"first admonition-title\" style=\"font-weight: bold;\">Note</p>\n",
    "<p class=\"last\">If you want a deeper overview regarding this dataset, you can refer to the\n",
    "Appendix - Datasets description section at the end of this MOOC.</p>\n",
    "</div>"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "First, create two models, a linear regression model and a decision tree\n",
    "regression model, and fit them on the training data. Limit the depth at 3\n",
    "levels for the decision tree."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "# Write your code here."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Create a synthetic dataset containing all possible flipper length from the\n",
    "minimum to the maximum of the training dataset. Get the predictions of each\n",
    "model using this dataset."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "# Write your code here."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Create a scatter plot containing the training samples and superimpose the\n",
    "predictions of both models on the top."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "# Write your code here."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Now, we check the extrapolation capabilities of each model. Create a dataset\n",
    "containing a broader range of values than your previous dataset, in other\n",
    "words, add values below and above the minimum and the maximum of the flipper\n",
    "length seen during training."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "# Write your code here."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Finally, make predictions with both models on this new interval of data.\n",
    "Repeat the plotting of the previous exercise."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "# Write your code here."
   ]
  }
 ],
 "metadata": {
  "jupytext": {
   "main_language": "python"
  },
  "kernelspec": {
   "display_name": "Python 3",
   "name": "python3"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 5
}